A Purposeful Selection of Variables Macro for Logistic Regression

Authors

  • Zoran Bursac
  • C. Heath Gauss
  • D. Keith Williams
  • David Hosmer
Abstract

The main problem in any model-building situation is to choose from a large set of covariates those that should be included in the “best” model. A decision to keep a variable in the model might be based on clinical or statistical significance. There are several variable selection algorithms embedded in SAS PROC LOGISTIC. Those methods are mechanical and as such carry some limitations. Hosmer and Lemeshow describe a purposeful selection of covariates algorithm within which an analyst makes a variable selection decision at each step of the modeling process. In this paper we introduce a macro, %PurposefulSelection, which automates that process. The macro is based on the following algorithm: (1) fit a univariate model with each covariate, (2) select as candidates for a multivariate model those significant at some chosen alpha level, (3) identify those variables that are not significant in the multivariate model at some arbitrary alpha level, (4) fit a reduced model and evaluate confounding by the change in parameter estimates, (5) repeat steps 3 and 4 until the model contains only significant covariates and/or confounders, and (6) add back into the model, one at a time, any variable not originally selected, keep any that are significant, and reduce the model following steps 3 and 4. At the end of step 6, the analyst will have a “main effects model.” Performance of the macro is illustrated with an application to the Hosmer and Lemeshow Worcester Heart Attack Study (WHAS) data.
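The six steps above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the %PurposefulSelection macro itself: the `fit` callable, the function names, and the thresholds (0.25 to enter, 0.10 to stay, a 20% coefficient change to flag confounding) are assumptions chosen for the example and are not stated in the abstract. The `toy_fit` stub returns invented numbers so the sketch runs without a statistics library; in practice `fit` would wrap a real logistic regression.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interface: fit(variables) -> {variable: (coefficient, p_value)}
Fit = Callable[[Sequence[str]], Dict[str, Tuple[float, float]]]


def reduce_model(fit: Fit, model: Sequence[str],
                 alpha_stay: float, max_change: float) -> List[str]:
    """Steps 3-5: drop non-significant variables unless they act as confounders."""
    model = list(model)
    confounders = set()  # non-significant variables retained because removal shifts estimates
    while True:
        full = fit(model)
        # Step 3: non-significant variables not already flagged as confounders.
        removable = [v for v in model
                     if full[v][1] > alpha_stay and v not in confounders]
        if not removable:
            return model
        worst = max(removable, key=lambda v: full[v][1])
        rest = [v for v in model if v != worst]
        reduced = fit(rest)
        # Step 4: does dropping `worst` move any remaining coefficient by
        # more than max_change, relative to the full-model estimate?
        shifts = any(
            full[v][0] != 0
            and abs(reduced[v][0] - full[v][0]) / abs(full[v][0]) > max_change
            for v in rest
        )
        if shifts:
            confounders.add(worst)   # keep it as a confounder
        else:
            model = rest             # safe to drop; step 5 repeats the loop


def purposeful_selection(fit: Fit, covariates: Sequence[str],
                         alpha_enter: float = 0.25,   # assumed threshold
                         alpha_stay: float = 0.10,    # assumed threshold
                         max_change: float = 0.20     # assumed threshold
                         ) -> List[str]:
    # Steps 1-2: univariate screening.
    candidates = [v for v in covariates if fit([v])[v][1] <= alpha_enter]
    model = reduce_model(fit, candidates, alpha_stay, max_change)
    # Step 6: retry each variable that failed screening, one at a time.
    for v in covariates:
        if v in candidates:
            continue
        if fit(model + [v])[v][1] <= alpha_stay:
            model = reduce_model(fit, model + [v], alpha_stay, max_change)
    return model


def toy_fit(variables: Sequence[str]) -> Dict[str, Tuple[float, float]]:
    """Stub with INVENTED numbers (not WHAS results), for demonstration only."""
    out = {}
    for v in variables:
        if v == "age":    # coefficient shifts when sex is left out
            out[v] = (0.05 if "sex" in variables else 0.08, 0.001)
        elif v == "sex":  # never significant, but confounds age
            out[v] = (0.40, 0.20)
        elif v == "bmi":  # never significant, no confounding
            out[v] = (0.02, 0.60)
        elif v == "hr":   # significant only once age is adjusted for
            out[v] = (0.01, 0.04 if "age" in variables else 0.30)
    return out


print(purposeful_selection(toy_fit, ["age", "sex", "bmi", "hr"]))
```

With the stub, the call prints `['age', 'sex', 'hr']`: `sex` is retained as a confounder of `age` despite its p-value, `bmi` is never admitted, and `hr` fails univariate screening but enters at step 6. Passing the model fit as a callable keeps the selection logic separate from the estimation routine.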

Related resources

Purposeful Selection of Variables in Logistic Regression: Macro and Simulation Results

The main problem in any model-building situation is to choose from a large set of covariates those that should be included in the “best” model. A decision to keep a variable in the model might be based on the clinical or statistical significance. There are several variable selection algorithms embedded in SAS PROC LOGISTIC. Those methods are mechanical and as such carry some limitations. Hosmer...

Augmented Backward Elimination: A Pragmatic and Purposeful Way to Develop Statistical Models

Statistical models are simple mathematical rules derived from empirical data describing the association between an outcome and several explanatory variables. In a typical modeling situation statistical analysis often involves a large number of potential explanatory variables and frequently only partial subject-matter knowledge is available. Therefore, selecting the most suitable variables for a...

A SAS Macro for Hosmer and Lemeshow’s Purposeful Selection Model Building Algorithm: Description and Performance

A common problem in many model-building situations is to choose from a large set of covariates that should be included in the “best” model. An additional consideration in modeling epidemiological data is the inclusion of confounders, which adds a quirk in the modeling procedure in that statistical significance is not the main criteria for keeping predictors in a model. Hosmer and Lemeshow (2000...

Effects of Multicollinearity in All Possible Mixed Model Selection

The effects of multicollinearity in all possible model selection of fixed effects including quadratic and cross products in the presence of random and repeated measures effects are presented here. The user-friendly SAS macro application ALLMIXED2 complements the model selection option currently available in the SAS macro applications ‘REGDIAG’ and ‘LOGISTIC’ for multiple linear and logistic reg...

Credit Risk Measurement of Trusted Customers Using Logistic Regression and Neural Networks

The issue of credit risk and deferred bank claims is one of the sensitive issues of banking industry, which can be considered as the main cause of bank failures. In recent years, the economic slowdown accompanied by inflation in Iran has led to an increase in deferred bank claims that could put the country's banking system in serious trouble. Accordingly, the current paper presents a prediction...

Publication date: 2007